Overview

Dataset statistics

Number of variables 14
Number of observations 20733
Missing cells 195236
Missing cells (%) 67.3%
Duplicate rows 0
Duplicate rows (%) 0.0%
Total size in memory 2.2 MiB
Average record size in memory 112.0 B
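
The derived figures above can be checked from the raw counts. The following is a minimal sketch, assuming the missing-cell percentage is taken over all cells (variables × observations) and that total memory is approximately the average record size times the number of observations; the variable names are illustrative.

```python
# Raw figures copied from the dataset statistics above.
n_variables = 14
n_observations = 20733
missing_cells = 195236
avg_record_bytes = 112.0

# Assumption: the missing-cell percentage is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: total memory is roughly record size times number of records.
total_mib = avg_record_bytes * n_observations / 2**20

print(f"total cells:   {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"memory:        {total_mib:.1f} MiB")
```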

Variable types

DateTime 1
Categorical 4
Numeric 9

Dataset

Description A sensor that returns a label identifying the activity the user is performing, detected from low-power signals from multiple sensors in the device by means of Google’s Activity Recognition APIs. Possible activities are: still, in_vehicle, on_bycicle, on_foot, running, tilting, walking. To make observations comparable across sensors, the sampling frequency was reduced to one minute. For each categorical variable, the first non-missing value is reported.
Creator Matteo Busso, Massimo Stefan
Author Fausto Giunchiglia, Ivano Bison, Matteo Busso, Ronald Chenu-Abente, Marcelo Rodas Britez, Can Gunel, Giuseppe Veltri, Amalia de Götzen, Peter Kun, Amarsanaa Ganbold, Altangerel Chagnaa, George Gaskell, Miriam Bidoglia, Luca Cernuzzi, Alethia Hume, Jose Luis Zarza, Daniele Miorandi, Carlo Caprini
URL
Copyright (c) KnowDive 2022
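
The one-minute reduction mentioned in the description can be sketched as follows. This is only an illustration, assuming that samples are bucketed by truncating the timestamp to the minute and that the first non-missing label in each bucket is kept; the function name and the sample readings are hypothetical, not taken from the dataset's processing pipeline.

```python
from datetime import datetime

def downsample_to_minute(samples):
    """Keep, per one-minute bucket, the first non-missing label.

    `samples` is an iterable of (datetime, label-or-None) pairs in
    chronological order; the return value maps each minute to a label
    (or None when every sample in that minute was missing).
    """
    buckets = {}
    for ts, label in samples:
        minute = ts.replace(second=0, microsecond=0)
        # Only overwrite a bucket while it still holds no label.
        if buckets.get(minute) is None:
            buckets[minute] = label
    return buckets

# Hypothetical raw readings, several per minute.
raw = [
    (datetime(1900, 11, 22, 3, 46, 5), None),
    (datetime(1900, 11, 22, 3, 46, 20), "Still"),
    (datetime(1900, 11, 22, 3, 46, 50), "Tilting"),
    (datetime(1900, 11, 22, 3, 47, 10), None),
]
per_minute = downsample_to_minute(raw)
for minute, label in per_minute.items():
    print(minute, label)
```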

Variable descriptions

experimentId Experiment Id
userId User id
timestamp Timestamp showing month(2), day(2), hour(2), minute(2), second(2), decimals(3)
day Day showing month(2), day(2)
label The activity name with highest accuracy
accuracy The highest accuracy for possible activities
InVehicle The value of the "in_vehicle" activity
OnBycicle The value of the "on_bycicle" activity
OnFoot The value of the "on_foot" activity
Running The value of the "running" activity
Still The value of the "still" activity
Unknown The value of the "unknown" activity
Walking The value of the "walking" activity
Tilting The value of the "tilting" activity
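
The relationship between label, accuracy and the per-activity columns can be illustrated with a short sketch. It assumes that each activity column holds a confidence value, that missing values are represented as None, and that label and accuracy are simply the name and value of the largest one; the helper name and the example row are hypothetical.

```python
def top_activity(confidences):
    """Return (label, accuracy) for the activity with the highest value.

    `confidences` maps activity column names to a value, or None when
    the value is missing for that observation.
    """
    present = {name: v for name, v in confidences.items() if v is not None}
    if not present:
        return None, None
    label = max(present, key=present.get)
    return label, present[label]

# Hypothetical observation; the numbers are illustrative only.
row = {"InVehicle": 2, "OnFoot": 1, "Still": 96, "Unknown": 1, "Walking": None}
label, accuracy = top_activity(row)
print(label, accuracy)
```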

Alerts

experimentId has constant value "wenet" Constant
Tilting has constant value "100.0" Constant
accuracy is highly correlated with Still High correlation
OnBycicle is highly correlated with Unknown High correlation
OnFoot is highly correlated with Running and 1 other fields High correlation
Running is highly correlated with OnFoot and 2 other fields High correlation
Still is highly correlated with accuracy and 1 other fields High correlation
Unknown is highly correlated with OnBycicle and 2 other fields High correlation
Walking is highly correlated with OnFoot and 1 other fields High correlation
accuracy is highly correlated with Still and 1 other fields High correlation
OnFoot is highly correlated with Walking High correlation
Still is highly correlated with accuracy and 1 other fields High correlation
Unknown is highly correlated with accuracy and 1 other fields High correlation
Walking is highly correlated with OnFoot High correlation
accuracy is highly correlated with Still High correlation
OnBycicle is highly correlated with Unknown High correlation
OnFoot is highly correlated with Running and 1 other fields High correlation
Running is highly correlated with OnFoot and 1 other fields High correlation
Still is highly correlated with accuracy and 1 other fields High correlation
Unknown is highly correlated with OnBycicle and 1 other fields High correlation
Walking is highly correlated with OnFoot and 1 other fields High correlation
Unknown is highly correlated with experimentId and 1 other fields High correlation
experimentId is highly correlated with Unknown and 2 other fields High correlation
label is highly correlated with experimentId and 1 other fields High correlation
Tilting is highly correlated with Unknown and 2 other fields High correlation
userId is highly correlated with day High correlation
day is highly correlated with userId High correlation
label is highly correlated with accuracy and 6 other fields High correlation
accuracy is highly correlated with label and 5 other fields High correlation
InVehicle is highly correlated with label and 2 other fields High correlation
OnBycicle is highly correlated with label High correlation
OnFoot is highly correlated with label and 5 other fields High correlation
Running is highly correlated with OnFoot and 1 other fields High correlation
Still is highly correlated with label and 5 other fields High correlation
Unknown is highly correlated with label and 3 other fields High correlation
Walking is highly correlated with label and 4 other fields High correlation
experimentId has 12799 (61.7%) missing values Missing
userId has 12799 (61.7%) missing values Missing
day has 12799 (61.7%) missing values Missing
label has 12799 (61.7%) missing values Missing
accuracy has 12799 (61.7%) missing values Missing
InVehicle has 16280 (78.5%) missing values Missing
OnBycicle has 17042 (82.2%) missing values Missing
OnFoot has 16494 (79.6%) missing values Missing
Running has 17904 (86.4%) missing values Missing
Still has 13023 (62.8%) missing values Missing
Unknown has 15043 (72.6%) missing values Missing
Walking has 16494 (79.6%) missing values Missing
Tilting has 18961 (91.5%) missing values Missing
timestamp has unique values Unique
userId has 308 (1.5%) zeros Zeros
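
The Constant, Missing, Unique and Zeros alerts above follow simple per-column rules. The sketch below approximates them for a single column held as a Python list; it is an illustration under assumed rules (None marks a missing cell), not the profiling tool's own implementation, and it does not cover the correlation alerts.

```python
def column_alerts(name, values):
    """Return simple alert strings for one column.

    `values` is a list in which None marks a missing cell. The rules are
    an illustrative approximation, not the profiling tool's own logic.
    """
    n = len(values)
    present = [v for v in values if v is not None]
    alerts = []
    n_missing = n - len(present)
    if n_missing:
        alerts.append(f"{name} has {n_missing} ({100 * n_missing / n:.1f}%) missing values")
    distinct = set(present)
    if len(distinct) == 1:
        alerts.append(f'{name} has constant value "{next(iter(distinct))}"')
    if present and len(distinct) == n:
        alerts.append(f"{name} has unique values")
    n_zeros = sum(1 for v in present if v == 0)
    if n_zeros:
        alerts.append(f"{name} has {n_zeros} ({100 * n_zeros / n:.1f}%) zeros")
    return alerts

# Hypothetical toy columns.
print(column_alerts("experimentId", ["wenet", None, "wenet", None]))
print(column_alerts("userId", [0, 4, 13, None]))
```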

Reproduction

Analysis started 2022-07-04 18:04:00.392764
Analysis finished 2022-07-04 18:04:22.227034
Duration 21.83 seconds
Software version pandas-profiling v3.2.0
Download configuration config.json
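
The reported duration is the difference between the two analysis timestamps, which can be confirmed with a few lines of standard-library Python:

```python
from datetime import datetime

started = datetime.fromisoformat("2022-07-04 18:04:00.392764")
finished = datetime.fromisoformat("2022-07-04 18:04:22.227034")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```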

Variables

timestamp
Date

UNIQUE

Timestamp showing month(2), day(2), hour(2), minute(2), second(2), decimals(3)

Distinct 20733
Distinct (%) 100.0%
Missing 0
Missing (%) 0.0%
Memory size 162.1 KiB
Minimum 1900-11-22 03:46:00
Maximum 1900-12-06 13:18:00
Histogram with fixed size bins (bins=50)
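
The field list for timestamp contains no year, which may explain why the minimum and maximum fall in 1900: Python's `datetime.strptime` uses 1900 when the format has no year field. The sketch below assumes the raw value is a 13-digit string concatenating the listed fields (MMDDhhmmssSSS); that layout and the helper name are assumptions made for illustration.

```python
from datetime import datetime, timedelta

def parse_timestamp(raw):
    """Parse an assumed 13-digit 'MMDDhhmmssSSS' string into a datetime."""
    if len(raw) != 13 or not raw.isdigit():
        raise ValueError(f"expected 13 digits, got {raw!r}")
    # No year is present, so strptime falls back to its default of 1900.
    base = datetime.strptime(raw[:10], "%m%d%H%M%S")
    return base + timedelta(milliseconds=int(raw[10:]))

# Hypothetical raw value matching the reported minimum timestamp.
ts = parse_timestamp("1122034600000")
print(ts)
```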

experimentId
Categorical

CONSTANT
HIGH CORRELATION
MISSING
REJECTED

Experiment Id

Distinct 1
Distinct (%) < 0.1%
Missing 12799
Missing (%) 61.7%
Memory size 162.1 KiB
wenet
7934

Length

Max length 5
Median length 5
Mean length 5
Min length 5

Characters and Unicode

Total characters 39670
Distinct characters 4
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row wenet
2nd row wenet
3rd row wenet
4th row wenet
5th row wenet

Common Values

Value Count Frequency (%)
wenet 7934 38.3%
(Missing) 12799 61.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value Count Frequency (%)
wenet 7934
100.0%

Most occurring characters

Value Count Frequency (%)
e 15868
40.0%
w 7934
20.0%
n 7934
20.0%
t 7934
20.0%
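
Because every non-missing experimentId equals "wenet", the character statistics follow directly from the row count. A minimal check, using only the figures reported above:

```python
from collections import Counter

# Figures from the report: 7934 non-missing rows, all equal to "wenet".
value, n_rows = "wenet", 7934

char_counts = Counter(value * n_rows)
total_chars = sum(char_counts.values())

print(total_chars)          # total characters
print(len(char_counts))     # distinct characters
for ch, count in char_counts.most_common():
    print(ch, count, f"{100 * count / total_chars:.1f}%")
```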

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 39670
100.0%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
e 15868
40.0%
w 7934
20.0%
n 7934
20.0%
t 7934
20.0%

Most occurring scripts

Value Count Frequency (%)
Latin 39670
100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
e 15868
40.0%
w 7934
20.0%
n 7934
20.0%
t 7934
20.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 39670
100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
e 15868
40.0%
w 7934
20.0%
n 7934
20.0%
t 7934
20.0%

userId
Real number (ℝ ≥0 )

HIGH CORRELATION
MISSING
ZEROS

User id

Distinct 6
Distinct (%) 0.1%
Missing 12799
Missing (%) 61.7%
Infinite 0
Infinite (%) 0.0%
Mean 8.418326191
Minimum 0
Maximum 14
Zeros 308
Zeros (%) 1.5%
Negative 0
Negative (%) 0.0%
Memory size 162.1 KiB

Quantile statistics

Minimum 0
5-th percentile 4
Q1 4
median 11
Q3 13
95-th percentile 13
Maximum 14
Range 14
Interquartile range (IQR) 9

Descriptive statistics

Standard deviation 4.721469017
Coefficient of variation (CV) 0.5608560312
Kurtosis -1.777878199
Mean 8.418326191
Median Absolute Deviation (MAD) 3
Skewness -0.1352946335
Sum 66791
Variance 22.29226968
Monotonicity Not monotonic
Histogram with fixed size bins (bins=6)
Value Count Frequency (%)
13 3619 17.5%
4 3513 16.9%
0 308 1.5%
11 239 1.2%
14 216 1.0%
1 39 0.2%
(Missing) 12799 61.7%
Value Count Frequency (%)
0 308 1.5%
1 39 0.2%
4 3513 16.9%
11 239 1.2%
13 3619 17.5%
14 216 1.0%
Value Count Frequency (%)
14 216 1.0%
13 3619 17.5%
11 239 1.2%
4 3513 16.9%
1 39 0.2%
0 308 1.5%
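
The summary statistics reported for userId can be recomputed from its value counts. The sketch below expands the frequency table into individual observations and uses the standard library's sample variance (n − 1 denominator), which appears to match the reported figure.

```python
import statistics

# Value counts copied from the userId frequency table (missing rows excluded).
counts = {13: 3619, 4: 3513, 0: 308, 11: 239, 14: 216, 1: 39}

# Expand the frequency table back into individual observations.
values = [v for v, c in counts.items() for _ in range(c)]

n = len(values)
total = sum(values)
mean = total / n
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)

print(n, total, round(mean, 6), median, round(variance, 4))
```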

day
Real number (ℝ ≥0 )

HIGH CORRELATION
MISSING

Day showing month(2), day(2)

Distinct 15
Distinct (%) 0.2%
Missing 12799
Missing (%) 61.7%
Infinite 0
Infinite (%) 0.0%
Mean 1153.450718
Minimum 1122
Maximum 1206
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 162.1 KiB

Quantile statistics

Minimum 1122
5-th percentile 1122
Q1 1125
median 1129
Q3 1202
95-th percentile 1204
Maximum 1206
Range 84
Interquartile range (IQR) 77
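
Since day encodes month and day as a single number, the numeric statistics above are not calendar distances: the jump from 1130 to 1201 is one day, not 71. The following sketch decodes the values and compares the numeric range with the elapsed days. The year is not stored in the data, so 1900 is used purely as a placeholder, and the helper name is hypothetical.

```python
from datetime import date

def mmdd_to_date(code, year=1900):
    """Convert an integer such as 1122 (month 11, day 22) into a date.

    The year is not stored in the data; 1900 is only a placeholder.
    """
    month, day = divmod(code, 100)
    return date(year, month, day)

first, last = mmdd_to_date(1122), mmdd_to_date(1206)
numeric_range = 1206 - 1122          # what the report calls "Range"
calendar_span = (last - first).days  # elapsed days between the two dates

print(numeric_range, calendar_span)
```

A span of 14 elapsed days covers 15 calendar dates, which is consistent with the 15 distinct values reported for this variable.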

Descriptive statistics

Standard deviation 36.90225817
Coefficient of variation (CV) 0.0319929214
Kurtosis -1.647152659
Mean 1153.450718
Median Absolute Deviation (MAD) 6
Skewness 0.5801259479
Sum 9151478
Variance 1361.776658
Monotonicity Increasing
Histogram with fixed size bins (bins=15)
Value Count Frequency (%)
1201 814
3.9%
1130 808
3.9%
1125 748
3.6%
1124 740
3.6%
1123 737
3.6%
1202 707
3.4%
1127 603
2.9%
1204 573
2.8%
1129 466
2.2%
1122 446
2.2%
Other values (5) 1292
6.2%
(Missing) 12799
61.7%
Value Count Frequency (%)
1122 446
2.2%
1123 737
3.6%
1124 740
3.6%
1125 748
3.6%
1126 299
1.4%
1127 603
2.9%
1128 240
1.2%
1129 466
2.2%
1130 808
3.9%
1201 814
3.9%
Value Count Frequency (%)
1206 77
0.4%
1205 308
1.5%
1204 573
2.8%
1203 368
1.8%
1202 707
3.4%
1201 814
3.9%
1130 808
3.9%
1129 466
2.2%
1128 240
1.2%
1127 603
2.9%

label
Categorical

HIGH CORRELATION
MISSING

The activity name with highest accuracy

Distinct 6
Distinct (%) 0.1%
Missing 12799
Missing (%) 61.7%
Memory size 162.1 KiB
Still
4948
Unknown
1171
Tilting
1093
OnFoot
453
InVehicle
260

Length

Max length 9
Median length 5
Mean length 5.763423242
Min length 5

Characters and Unicode

Total characters 45727
Distinct characters 20
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Tilting
2nd row Tilting
3rd row Still
4th row Still
5th row Still

Common Values

Value Count Frequency (%)
Still 4948 23.9%
Unknown 1171 5.6%
Tilting 1093 5.3%
OnFoot 453 2.2%
InVehicle 260 1.3%
OnBycicle 9 < 0.1%
(Missing) 12799 61.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value Count Frequency (%)
still 4948
62.4%
unknown 1171
14.8%
tilting 1093
13.8%
onfoot 453
5.7%
invehicle 260
3.3%
onbycicle 9
0.1%
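
The label percentages differ between the Common Values table and the Category Frequency Plot because they appear to use different denominators: all observations in the former, non-missing observations in the latter. A short check with the counts from the report:

```python
n_observations = 20733
label_counts = {
    "Still": 4948, "Unknown": 1171, "Tilting": 1093,
    "OnFoot": 453, "InVehicle": 260, "OnBycicle": 9,
}
n_present = sum(label_counts.values())  # rows with a non-missing label

for label, count in label_counts.items():
    share_of_all = 100 * count / n_observations
    share_of_present = 100 * count / n_present
    print(f"{label:<10} {share_of_all:5.1f}% of all rows, "
          f"{share_of_present:5.1f}% of non-missing rows")
```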

Most occurring characters

Value Count Frequency (%)
l 11258
24.6%
i 7403
16.2%
t 6494
14.2%
n 5328
11.7%
S 4948
10.8%
o 2077
4.5%
U 1171
2.6%
k 1171
2.6%
w 1171
2.6%
g 1093
2.4%
Other values (10) 3613
7.9%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 37071
81.1%
Uppercase Letter 8656
18.9%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
l 11258
30.4%
i 7403
20.0%
t 6494
17.5%
n 5328
14.4%
o 2077
5.6%
k 1171
3.2%
w 1171
3.2%
g 1093
2.9%
e 529
1.4%
c 278
0.7%
Other values (2) 269
0.7%
Uppercase Letter
Value Count Frequency (%)
S 4948
57.2%
U 1171
13.5%
T 1093
12.6%
O 462
5.3%
F 453
5.2%
I 260
3.0%
V 260
3.0%
B 9
0.1%

Most occurring scripts

Value Count Frequency (%)
Latin 45727
100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
l 11258
24.6%
i 7403
16.2%
t 6494
14.2%
n 5328
11.7%
S 4948
10.8%
o 2077
4.5%
U 1171
2.6%
k 1171
2.6%
w 1171
2.6%
g 1093
2.4%
Other values (10) 3613
7.9%

Most occurring blocks

Value Count Frequency (%)
ASCII 45727
100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
l 11258
24.6%
i 7403
16.2%
t 6494
14.2%
n 5328
11.7%
S 4948
10.8%
o 2077
4.5%
U 1171
2.6%
k 1171
2.6%
w 1171
2.6%
g 1093
2.4%
Other values (10) 3613
7.9%

accuracy
Real number (ℝ ≥0 )

HIGH CORRELATION
MISSING

The highest accuracy for possible activities

Distinct 75
Distinct (%) 0.9%
Missing 12799
Missing (%) 61.7%
Infinite 0
Infinite (%) 0.0%
Mean 86.34257625
Minimum 26
Maximum 100
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 162.1 KiB

Quantile statistics

Minimum 26
5-th percentile 40
Q1 86
median 99
Q3 100
95-th percentile 100
Maximum 100
Range 74
Interquartile range (IQR) 14

Descriptive statistics

Standard deviation 22.86123216
Coefficient of variation (CV) 0.26477357
Kurtosis 0.1052264047
Mean 86.34257625
Median Absolute Deviation (MAD) 1
Skewness -1.385442076
Sum 685042
Variance 522.6359357
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
100 3498
16.9%
99 1195
5.8%
40 1183
5.7%
96 336
1.6%
97 254
1.2%
98 248
1.2%
93 59
0.3%
92 53
0.3%
91 51
0.2%
94 48
0.2%
Other values (65) 1009
4.9%
(Missing) 12799
61.7%
Value Count Frequency (%)
26 1
< 0.1%
27 2
< 0.1%
28 2
< 0.1%
29 5
< 0.1%
30 3
< 0.1%
31 3
< 0.1%
32 3
< 0.1%
33 3
< 0.1%
34 4
< 0.1%
35 7
< 0.1%
Value Count Frequency (%)
100 3498
16.9%
99 1195
5.8%
98 248
1.2%
97 254
1.2%
96 336
1.6%
95 41
0.2%
94 48
0.2%
93 59
0.3%
92 53
0.3%
91 51
0.2%

InVehicle
Real number (ℝ ≥0 )

HIGH CORRELATION
MISSING

The value of the "in_vehicle" activity

Distinct 71
Distinct (%) 1.6%
Missing 16280
Missing (%) 78.5%
Infinite 0
Infinite (%) 0.0%
Mean 15.11250842
Minimum 1
Maximum 98
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 162.1 KiB

Quantile statistics

Minimum 1
5-th percentile 1
Q1 3
median 10
Q3 10
95-th percentile 89
Maximum 98
Range 97
Interquartile range (IQR) 7

Descriptive statistics

Standard deviation 22.18792072
Coefficient of variation (CV) 1.468182522
Kurtosis 7.270027849
Mean 15.11250842
Median Absolute Deviation (MAD) 4
Skewness 2.862919214
Sum 67296
Variance 492.303826
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
10 1905
9.2%
1 736
3.5%
2 284
1.4%
3 106
0.5%
4 86
0.4%
6 73
0.4%
5 69
0.3%
96 61
0.3%
23 56
0.3%
9 53
0.3%
Other values (61) 1024
4.9%
(Missing) 16280
78.5%
Value Count Frequency (%)
1 736
3.5%
2 284
1.4%
3 106
0.5%
4 86
0.4%
5 69
0.3%
6 73
0.4%
7 43
0.2%
8 48
0.2%
9 53
0.3%
10 1905
9.2%
Value Count Frequency (%)
98 20
0.1%
97 47
0.2%
96 61
0.3%
95 15
0.1%
94 15
0.1%
93 21
0.1%
92 17
0.1%
91 15
0.1%
90 10
< 0.1%
89 12
0.1%

OnBycicle
Real number (ℝ ≥0 )

HIGH CORRELATION
MISSING

The value of the "on_bycicle" activity

Distinct 50
Distinct (%) 1.4%
Missing 17042
Missing (%) 82.2%
Infinite 0
Infinite (%) 0.0%
Mean 7.695204552
Minimum 1
Maximum 98
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 162.1 KiB

Quantile statistics

Minimum 1
5-th percentile 1
Q1 3
median 10
Q3 10
95-th percentile 10
Maximum 98
Range 97
Interquartile range (IQR) 7

Descriptive statistics

Standard deviation 7.01975735
Coefficient of variation (CV) 0.9122249192
Kurtosis 66.4281333
Mean 7.695204552
Median Absolute Deviation (MAD) 0
Skewness 6.409381599
Sum 28403
Variance 49.27699326
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
10 2012
9.7%
1 584
2.8%
2 328
1.6%
3 206
1.0%
4 128
0.6%
5 86
0.4%
6 66
0.3%
7 41
0.2%
8 37
0.2%
9 27
0.1%
Other values (40) 176
0.8%
(Missing) 17042
82.2%
Value Count Frequency (%)
1 584
2.8%
2 328
1.6%
3 206
1.0%
4 128
0.6%
5 86
0.4%
6 66
0.3%
7 41
0.2%
8 37
0.2%
9 27
0.1%
10 2012
9.7%
Value Count Frequency (%)
98 1
< 0.1%
94 1
< 0.1%
92 1
< 0.1%
88 1
< 0.1%
87 1
< 0.1%
85 1
< 0.1%
84 1
< 0.1%
83 1
< 0.1%
82 1
< 0.1%
80 1
< 0.1%

OnFoot
Real number (ℝ ≥0 )

HIGH CORRELATION
MISSING

The value of the "on_foot" activity

Distinct 68
Distinct (%) 1.6%
Missing 16494
Missing (%) 79.6%
Infinite 0
Infinite (%) 0.0%
Mean 20.844067
Minimum 1
Maximum 98
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 162.1 KiB

Quantile statistics

Minimum 1
5-th percentile 1
Q1 6
median 10
Q3 11
95-th percentile 94
Maximum 98
Range 97
Interquartile range (IQR) 5

Descriptive statistics

Standard deviation 29.47129657
Coefficient of variation (CV) 1.413893775
Kurtosis 1.847045072
Mean 20.844067
Median Absolute Deviation (MAD) 2
Skewness 1.894742966
Sum 88358
Variance 868.5573214
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
10 1926
9.3%
1 501
2.4%
2 144
0.7%
4 119
0.6%
3 112
0.5%
5 98
0.5%
96 94
0.5%
6 87
0.4%
11 69
0.3%
8 64
0.3%
Other values (58) 1025
4.9%
(Missing) 16494
79.6%
Value Count Frequency (%)
1 501
2.4%
2 144
0.7%
3 112
0.5%
4 119
0.6%
5 98
0.5%
6 87
0.4%
7 62
0.3%
8 64
0.3%
9 52
0.3%
10 1926
9.3%
Value Count Frequency (%)
98 32
0.2%
97 52
0.3%
96 94
0.5%
95 31
0.1%
94 33
0.2%
93 40
0.2%
92 41
0.2%
91 37
0.2%
90 29
0.1%
89 26
0.1%

Running
Real number (ℝ ≥0 )

HIGH CORRELATION
MISSING

The value of the "running" activity

Distinct 29
Distinct (%) 1.0%
Missing 17904
Missing (%) 86.4%
Infinite 0
Infinite (%) 0.0%
Mean 8.675503712
Minimum 1
Maximum 91
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 162.1 KiB

Quantile statistics

Minimum 1
5-th percentile 1
Q1 10
median 10
Q3 10
95-th percentile 10
Maximum 91
Range 90
Interquartile range (IQR) 0

Descriptive statistics

Standard deviation 5.688088761
Coefficient of variation (CV) 0.6556493951
Kurtosis 83.45415677
Mean 8.675503712
Median Absolute Deviation (MAD) 0
Skewness 6.91572269
Sum 24543
Variance 32.35435375
Monotonicity Not monotonic
Histogram with fixed size bins (bins=29)
Value Count Frequency (%)
10 2215
10.7%
1 345
1.7%
2 106
0.5%
3 62
0.3%
4 29
0.1%
5 19
0.1%
6 13
0.1%
7 7
< 0.1%
8 4
< 0.1%
11 3
< 0.1%
Other values (19) 26
0.1%
(Missing) 17904
86.4%
Value Count Frequency (%)
1 345
1.7%
2 106
0.5%
3 62
0.3%
4 29
0.1%
5 19
0.1%
6 13
0.1%
7 7
< 0.1%
8 4
< 0.1%
9 1
< 0.1%
10 2215
10.7%
Value Count Frequency (%)
91 1
< 0.1%
88 1
< 0.1%
85 1
< 0.1%
81 1
< 0.1%
76 1
< 0.1%
74 1
< 0.1%
69 1
< 0.1%
68 1
< 0.1%
63 1
< 0.1%
57 2
< 0.1%

Still
Real number (ℝ ≥0 )

HIGH CORRELATION
MISSING

The value of the "still" activity

Distinct 91
Distinct (%) 1.2%
Missing 13023
Missing (%) 62.8%
Infinite 0
Infinite (%) 0.0%
Mean 68.32594034
Minimum 1
Maximum 100
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 162.1 KiB

Quantile statistics

Minimum 1
5-th percentile 3
Q1 10
median 98
Q3 100
95-th percentile 100
Maximum 100
Range 99
Interquartile range (IQR) 90

Descriptive statistics

Standard deviation 40.57475563
Coefficient of variation (CV) 0.5938411594
Kurtosis -1.383513007
Mean 68.32594034
Median Absolute Deviation (MAD) 2
Skewness -0.6903562138
Sum 526793
Variance 1646.310794
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
100 2560
12.3%
10 1673
8.1%
99 1281
6.2%
98 240
1.2%
96 237
1.1%
1 237
1.1%
97 194
0.9%
2 112
0.5%
3 52
0.3%
44 50
0.2%
Other values (81) 1074
5.2%
(Missing) 13023
62.8%
Value Count Frequency (%)
1 237
1.1%
2 112
0.5%
3 52
0.3%
4 31
0.1%
5 22
0.1%
6 33
0.2%
7 10
< 0.1%
8 15
0.1%
9 11
0.1%
10 1673
8.1%
Value Count Frequency (%)
100 2560
12.3%
99 1281
6.2%
98 240
1.2%
97 194
0.9%
96 237
1.1%
95 15
0.1%
94 13
0.1%
93 15
0.1%
92 11
0.1%
91 11
0.1%

Unknown
Categorical

HIGH CORRELATION
MISSING

The value of the "unknown" activity

Distinct 5
Distinct (%) 0.1%
Missing 15043
Missing (%) 72.6%
Memory size 162.1 KiB
1.0
2381
40.0
1726
2.0
1296
3.0
280
4.0
7

Length

Max length 4
Median length 3
Mean length 3.303339192
Min length 3

Characters and Unicode

Total characters 18796
Distinct characters 6
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 40.0
2nd row 40.0
3rd row 1.0
4th row 1.0
5th row 1.0

Common Values

Value Count Frequency (%)
1.0 2381 11.5%
40.0 1726 8.3%
2.0 1296 6.3%
3.0 280 1.4%
4.0 7 < 0.1%
(Missing) 15043 72.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value Count Frequency (%)
1.0 2381
41.8%
40.0 1726
30.3%
2.0 1296
22.8%
3.0 280
4.9%
4.0 7
0.1%

Most occurring characters

Value Count Frequency (%)
0 7416
39.5%
. 5690
30.3%
1 2381
12.7%
4 1733
9.2%
2 1296
6.9%
3 280
1.5%

Most occurring categories

Value Count Frequency (%)
Decimal Number 13106
69.7%
Other Punctuation 5690
30.3%

Most frequent character per category

Decimal Number
Value Count Frequency (%)
0 7416
56.6%
1 2381
18.2%
4 1733
13.2%
2 1296
9.9%
3 280
2.1%
Other Punctuation
Value Count Frequency (%)
. 5690
100.0%

Most occurring scripts

Value Count Frequency (%)
Common 18796
100.0%

Most frequent character per script

Common
Value Count Frequency (%)
0 7416
39.5%
. 5690
30.3%
1 2381
12.7%
4 1733
9.2%
2 1296
6.9%
3 280
1.5%

Most occurring blocks

Value Count Frequency (%)
ASCII 18796
100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
0 7416
39.5%
. 5690
30.3%
1 2381
12.7%
4 1733
9.2%
2 1296
6.9%
3 280
1.5%

Walking
Real number (ℝ ≥0 )

HIGH CORRELATION
MISSING

The value of the "walking" activity

Distinct 61
Distinct (%) 1.4%
Missing 16494
Missing (%) 79.6%
Infinite 0
Infinite (%) 0.0%
Mean 20.68766218
Minimum 1
Maximum 98
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 162.1 KiB

Quantile statistics

Minimum 1
5-th percentile 1
Q1 6
median 10
Q3 11
95-th percentile 94
Maximum 98
Range 97
Interquartile range (IQR) 5

Descriptive statistics

Standard deviation 29.33925685
Coefficient of variation (CV) 1.418200693
Kurtosis 1.940844507
Mean 20.68766218
Median Absolute Deviation (MAD) 2
Skewness 1.918739928
Sum 87695
Variance 860.7919926
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
10 1927
9.3%
1 501
2.4%
2 144
0.7%
4 119
0.6%
3 112
0.5%
5 98
0.5%
96 94
0.5%
6 87
0.4%
11 70
0.3%
8 65
0.3%
Other values (51) 1022
4.9%
(Missing) 16494
79.6%
Value Count Frequency (%)
1 501
2.4%
2 144
0.7%
3 112
0.5%
4 119
0.6%
5 98
0.5%
6 87
0.4%
7 63
0.3%
8 65
0.3%
9 53
0.3%
10 1927
9.3%
Value Count Frequency (%)
98 32
0.2%
97 52
0.3%
96 94
0.5%
95 31
0.1%
94 33
0.2%
93 40
0.2%
92 41
0.2%
91 36
0.2%
90 29
0.1%
89 26
0.1%

Tilting
Categorical

CONSTANT
HIGH CORRELATION
MISSING
REJECTED

The value of the "tilting" activity

Distinct 1
Distinct (%) 0.1%
Missing 18961
Missing (%) 91.5%
Memory size 162.1 KiB
100.0
1772

Length

Max length 5
Median length 5
Mean length 5
Min length 5

Characters and Unicode

Total characters 8860
Distinct characters 3
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 100.0
2nd row 100.0
3rd row 100.0
4th row 100.0
5th row 100.0

Common Values

Value Count Frequency (%)
100.0 1772 8.5%
(Missing) 18961 91.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value Count Frequency (%)
100.0 1772
100.0%

Most occurring characters

Value Count Frequency (%)
0 5316
60.0%
1 1772
20.0%
. 1772
20.0%

Most occurring categories

Value Count Frequency (%)
Decimal Number 7088
80.0%
Other Punctuation 1772
20.0%

Most frequent character per category

Decimal Number
Value Count Frequency (%)
0 5316
75.0%
1 1772
25.0%
Other Punctuation
Value Count Frequency (%)
. 1772
100.0%

Most occurring scripts

Value Count Frequency (%)
Common 8860
100.0%

Most frequent character per script

Common
Value Count Frequency (%)
0 5316
60.0%
1 1772
20.0%
. 1772
20.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 8860
100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
0 5316
60.0%
1 1772
20.0%
. 1772
20.0%

Interactions

[Figures: 81 interaction plots between pairs of variables; images not reproduced]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
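The calculation can be sketched in plain Python as follows. This is a minimal illustration, not the code used to produce this report: the helper names `average_ranks`, `pearson`, and `spearman` are hypothetical, tied values are assumed to receive the mean of their rank positions, and the sample data is made up.

```python
from statistics import mean, pstdev


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))


def spearman(x, y):
    """Pearson correlation applied to the rank variables of x and y."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up example: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman(x, y)
r = pearson(x, y)
```

For this cubic relationship ρ is 1 (up to floating-point rounding) while r is noticeably below 1, which illustrates why ρ suits monotonic but nonlinear relationships.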

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only the direction of the slope determines its sign.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
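A minimal sketch of this formula, together with a check of the location and scale invariance mentioned above, might look like the following. The function name `pearson` and the sample values are illustrative assumptions, not part of the report.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))


# Made-up sample data.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson(x, y)

# Shifting and rescaling each variable separately (positive scale) leaves r unchanged.
x_rescaled = [2 * v + 3 for v in x]
y_rescaled = [0.5 * v - 1 for v in y]
r_rescaled = pearson(x_rescaled, y_rescaled)

# A negative scale factor on one variable flips the sign of r.
r_flipped = pearson(x, [-v for v in y])
```

Here `r` and `r_rescaled` agree up to rounding, and `r_flipped` has the same magnitude with the opposite sign.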

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
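The pair counting can be sketched as below. This implements the simple variant exactly as described here (often called τ-a), in which tied pairs count as neither concordant nor discordant; statistical libraries commonly apply a tie correction instead, so their results can differ when ties are present. The function name `kendall_tau_a` and the data are illustrative assumptions.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, ignoring tie corrections."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Made-up sample data: 10 pairs, of which 8 are concordant and 2 discordant.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
tau = kendall_tau_a(x, y)
```

With 8 concordant and 2 discordant pairs out of 10, τ is (8 - 2) / 10 = 0.6.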

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
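As a rough illustration, the bias correction is commonly stated as follows: with φ² = χ²/n for an r × k contingency table of n observations, one uses φ̃² = max(0, φ² - (k-1)(r-1)/(n-1)), r̃ = r - (r-1)²/(n-1), k̃ = k - (k-1)²/(n-1), and V = sqrt(φ̃² / min(k̃-1, r̃-1)). The sketch below follows that statement; it is not taken from the report's own code, the function name `cramers_v_corrected` is hypothetical, and the category values are made up.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of category labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)

    # Pearson chi-squared statistic over every cell of the contingency table.
    chi2 = 0.0
    for row_label, row_count in row_totals.items():
        for col_label, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((row_label, col_label), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    # A variable with a single category carries no association information.
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Made-up labels: the first pair is perfectly associated, the second independent.
v_perfect = cramers_v_corrected(["x", "y"] * 20, ["p", "q"] * 20)
v_independent = cramers_v_corrected(["x", "x", "y", "y"] * 10, ["p", "q", "p", "q"] * 10)
```

In this example `v_perfect` is 1 (up to rounding) and `v_independent` is 0, matching the two ends of the scale described above.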

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
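
The nullity correlation shown in the heatmap can be approximated by correlating presence indicators (1 if a value is present, 0 if missing) for each pair of columns. The sketch below assumes missing values are represented as `None`; the function name `nullity_correlation` and the sample values are made up for illustration, with column names borrowed from this dataset.

```python
from statistics import mean, pstdev


def nullity_correlation(col_a, col_b):
    """Pearson correlation of presence indicators; None if a column never varies."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    sa, sb = pstdev(a), pstdev(b)
    if sa == 0 or sb == 0:
        # A fully present or fully missing column has no nullity variation.
        return None
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b)) / len(a)
    return cov / (sa * sb)


# Made-up readings: Running and Walking are missing together, Still is missing
# exactly when they are present, and experimentId is never missing.
running = [0.1, None, 0.3, None]
walking = [0.5, None, 0.2, None]
still = [None, 0.9, None, 0.8]
experiment_id = ["wenet"] * 4

together = nullity_correlation(running, walking)
opposite = nullity_correlation(running, still)
undefined = nullity_correlation(running, experiment_id)
```

A value of 1 means two columns are always present or missing together, -1 means one is present exactly when the other is missing, and columns without any variation in nullity yield no correlation at all.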